Overview
Dataset statistics
| Number of variables | 6 |
|---|---|
| Number of observations | 125497040 |
| Missing cells | 21657651 |
| Missing cells (%) | 2.9% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 3.7 GiB |
| Average record size in memory | 32.0 B |
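The summary figures above can be cross-checked with a little arithmetic. The sketch below uses only numbers from this report and assumes that the missing-cell percentage is taken over all cells (rows × columns), that the total size is the average record size times the row count, and that GiB means 2^30 bytes.

```python
# Cross-check of the dataset statistics, using only figures from this report.
N_ROWS = 125_497_040
N_COLS = 6
MISSING_CELLS = 21_657_651   # all of them are in the onpromotion column
RECORD_BYTES = 32.0          # average record size in memory

# Assumption: the overall percentage is taken over rows x columns.
missing_pct = 100 * MISSING_CELLS / (N_ROWS * N_COLS)

# The per-column figure divides by the number of rows only.
onpromotion_missing_pct = 100 * MISSING_CELLS / N_ROWS

# Assumption: total size = average record size x number of rows.
total_gib = RECORD_BYTES * N_ROWS / 2**30

print(f"missing cells:           {missing_pct:.1f}%")
print(f"onpromotion missing:     {onpromotion_missing_pct:.1f}%")
print(f"total size in memory:    {total_gib:.1f} GiB")
```

Under these assumptions the three values round to the 2.9%, 17.3% and 3.7 GiB shown in the report.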
Variable types
| Numeric | 4 |
|---|---|
| DateTime | 1 |
| Boolean | 1 |
Alerts
| Alert | Type |
|---|---|
| onpromotion is highly imbalanced (61.5%) | Imbalance |
| onpromotion has 21657651 (17.3%) missing values | Missing |
| unit_sales is highly skewed (γ₁ = 582.2246437) | Skewed |
| id is uniformly distributed | Uniform |
| id has unique values | Unique |
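To make the skewness alert concrete, here is a minimal sketch of the moment coefficient of skewness, g1 = m3 / m2^1.5. The report does not state which estimator it uses (a bias-adjusted variant is common, and the difference is negligible at this row count), and the sample lists are toy data, not values from the dataset.

```python
def skewness(values):
    """Moment coefficient of skewness g1 = m3 / m2**1.5 (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Toy data (hypothetical): a symmetric sample and one with a large outlier.
symmetric = [1, 2, 3, 4, 5]
one_outlier = [1, 2, 3, 4, 5, 1000]

print(skewness(symmetric))    # symmetric data has zero skewness
print(skewness(one_outlier))  # a single large value pushes g1 well above 0
```

A handful of extreme values is enough to dominate the third moment, which is consistent with unit_sales having a median of 4 but a maximum of 89440.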
Reproduction
| Analysis started | 2026-01-06 22:34:19.856455 |
|---|---|
| Analysis finished | 2026-01-06 22:42:03.358105 |
| Duration | 7 minutes and 43.5 seconds |
| Software version | ydata-profiling v4.18.0 |
| Download configuration | config.json |
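A report like this one is typically produced with a few lines of Python. The following is a minimal sketch, not the exact invocation used here: the function name `build_report`, the file paths and the report title are hypothetical, and the settings stored in config.json are not reproduced.

```python
def build_report(csv_path, out_path="report.html"):
    """Profile a CSV file and write the result as an HTML report."""
    # Imported lazily so this module loads without the third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the data is a CSV file with a "date" column, as the
    # DateTime variable in this report suggests.
    df = pd.read_csv(csv_path, parse_dates=["date"])

    profile = ProfileReport(df, title="Profiling report")
    profile.to_file(out_path)
    return out_path
```

Calling `build_report("train.csv")` (a hypothetical path) would write `report.html` next to the script. On a dataset of this size the run took close to eight minutes, so profiling a sample first may be worthwhile.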
Variables
id
Real number (ℝ)
| Distinct | 125497040 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 62748520 |
| Minimum | 0 |
| Maximum | 1.2549704 × 10⁸ |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 957.5 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 6274852 |
| Q1 | 31374260 |
| median | 62748520 |
| Q3 | 94122779 |
| 95-th percentile | 1.1922219 × 10⁸ |
| Maximum | 1.2549704 × 10⁸ |
| Range | 1.2549704 × 10⁸ |
| Interquartile range (IQR) | 62748520 |
Descriptive statistics
| Standard deviation | 36227875 |
|---|---|
| Coefficient of variation (CV) | 0.57735028 |
| Kurtosis | -1.2 |
| Mean | 62748520 |
| Median Absolute Deviation (MAD) | 31374260 |
| Skewness | -9.4342576 × 10⁻¹⁷ |
| Sum | 7.8747535 × 10¹⁵ |
| Variance | 1.3124589 × 10¹⁵ |
| Monotonicity | Strictly increasing |
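The statistics above are what one would expect if id is simply the row index 0, 1, ..., N-1. The sketch below checks this with closed-form expressions. It assumes the report uses the sample variance (ddof = 1), which for these integers equals N(N+1)/12. The resulting coefficient of variation is very close to 1/√3, and the reported kurtosis of -1.2 matches the excess kurtosis of a uniform distribution.

```python
import math

N = 125_497_040  # number of observations; id is assumed to be 0 .. N-1

mean = (N - 1) / 2

# Assumption: sample variance (ddof=1). For the integers 0 .. N-1 the
# population variance is (N**2 - 1) / 12, so the sample variance is
# N * (N + 1) / 12.
variance = N * (N + 1) / 12
std = math.sqrt(variance)
cv = std / mean

print(f"mean     = {mean:.1f}")
print(f"variance = {variance:.7e}")
print(f"std      = {std:.0f}")
print(f"cv       = {cv:.8f}   (1/sqrt(3) = {1 / math.sqrt(3):.8f})")
```

Compare the printed values with the Mean, Variance, Standard deviation and CV rows of the table.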
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1 | < 0.1% |
| 1 | 1 | < 0.1% |
| 2 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| Other values (125497030) | 125497030 | > 99.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1 | < 0.1% |
| 1 | 1 | < 0.1% |
| 2 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 125497039 | 1 | < 0.1% |
| 125497038 | 1 | < 0.1% |
| 125497037 | 1 | < 0.1% |
| 125497036 | 1 | < 0.1% |
| 125497035 | 1 | < 0.1% |
| 125497034 | 1 | < 0.1% |
| 125497033 | 1 | < 0.1% |
| 125497032 | 1 | < 0.1% |
| 125497031 | 1 | < 0.1% |
| 125497030 | 1 | < 0.1% |
date
Date
| Distinct | 1684 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 957.5 MiB |
| Minimum | 2013-01-01 00:00:00 |
| Maximum | 2017-08-15 00:00:00 |
| Invalid dates | 0 |
| Invalid dates (%) | 0.0% |
store_nbr
Real number (ℝ)
| Distinct | 54 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 27.464578 |
| Minimum | 1 |
| Maximum | 54 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 239.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 12 |
| median | 28 |
| Q3 | 43 |
| 95-th percentile | 51 |
| Maximum | 54 |
| Range | 53 |
| Interquartile range (IQR) | 31 |
Descriptive statistics
| Standard deviation | 16.33051 |
|---|---|
| Coefficient of variation (CV) | 0.59460263 |
| Kurtosis | -1.3568904 |
| Mean | 27.464578 |
| Median Absolute Deviation (MAD) | 16 |
| Skewness | -0.074194851 |
| Sum | 3.4467232 × 10⁹ |
| Variance | 266.68557 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 44 | 3513089 | 2.8% |
| 45 | 3484244 | 2.8% |
| 47 | 3457407 | 2.8% |
| 3 | 3401264 | 2.7% |
| 46 | 3353890 | 2.7% |
| 49 | 3342531 | 2.7% |
| 8 | 3261184 | 2.6% |
| 48 | 3236523 | 2.6% |
| 50 | 3192566 | 2.5% |
| 6 | 3089799 | 2.5% |
| Other values (44) | 92164543 | 73.4% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 2562153 | 2.0% |
| 2 | 2987840 | 2.4% |
| 3 | 3401264 | 2.7% |
| 4 | 2830554 | 2.3% |
| 5 | 2666691 | 2.1% |
| 6 | 3089799 | 2.5% |
| 7 | 2921204 | 2.3% |
| 8 | 3261184 | 2.6% |
| 9 | 2773790 | 2.2% |
| 10 | 1740482 | 1.4% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 54 | 1648867 | 1.3% |
| 53 | 1938255 | 1.5% |
| 52 | 290581 | 0.2% |
| 51 | 2960031 | 2.4% |
| 50 | 3192566 | 2.5% |
| 49 | 3342531 | 2.7% |
| 48 | 3236523 | 2.6% |
| 47 | 3457407 | 2.8% |
| 46 | 3353890 | 2.7% |
| 45 | 3484244 | 2.8% |
item_nbr
Real number (ℝ)
| Distinct | 4036 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 972769.15 |
| Minimum | 96995 |
| Maximum | 2127114 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 478.7 MiB |
Quantile statistics
| Minimum | 96995 |
|---|---|
| 5-th percentile | 177395 |
| Q1 | 522383 |
| median | 959500 |
| Q3 | 1354380 |
| 95-th percentile | 1964356 |
| Maximum | 2127114 |
| Range | 2030119 |
| Interquartile range (IQR) | 831997 |
Descriptive statistics
| Standard deviation | 520533.6 |
|---|---|
| Coefficient of variation (CV) | 0.53510496 |
| Kurtosis | -0.78499653 |
| Mean | 972769.15 |
| Median Absolute Deviation (MAD) | 404376 |
| Skewness | 0.21928968 |
| Sum | 1.2207965 × 10¹⁴ |
| Variance | 2.7095523 × 10¹¹ |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 502331 | 83475 | 0.1% |
| 314384 | 83450 | 0.1% |
| 364606 | 83308 | 0.1% |
| 265559 | 83047 | 0.1% |
| 559870 | 82513 | 0.1% |
| 1036689 | 82134 | 0.1% |
| 273528 | 82108 | 0.1% |
| 564533 | 82086 | 0.1% |
| 261052 | 81774 | 0.1% |
| 414353 | 81755 | 0.1% |
| Other values (4026) | 124671390 | 99.3% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 96995 | 5229 | < 0.1% |
| 99197 | 4902 | < 0.1% |
| 103501 | 35841 | < 0.1% |
| 103520 | 53175 | < 0.1% |
| 103665 | 50449 | < 0.1% |
| 105574 | 40322 | < 0.1% |
| 105575 | 41311 | < 0.1% |
| 105576 | 39959 | < 0.1% |
| 105577 | 30113 | < 0.1% |
| 105693 | 51730 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2127114 | 247 | < 0.1% |
| 2126944 | 5 | < 0.1% |
| 2126842 | 12 | < 0.1% |
| 2124052 | 704 | < 0.1% |
| 2123863 | 12 | < 0.1% |
| 2123859 | 10 | < 0.1% |
| 2123839 | 13 | < 0.1% |
| 2123791 | 21 | < 0.1% |
| 2123790 | 8 | < 0.1% |
| 2123775 | 64 | < 0.1% |
unit_sales
Real number (ℝ)
| Distinct | 258474 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 8.5548653 |
| Minimum | -15372 |
| Maximum | 89440 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 7795 |
| Negative (%) | < 0.1% |
| Memory size | 957.5 MiB |
Quantile statistics
| Minimum | -15372 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 4 |
| Q3 | 9 |
| 95-th percentile | 29 |
| Maximum | 89440 |
| Range | 104812 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 23.605152 |
|---|---|
| Coefficient of variation (CV) | 2.7592663 |
| Kurtosis | 1796939.4 |
| Mean | 8.5548653 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 582.22464 |
| Sum | 1.0736103 × 10⁹ |
| Variance | 557.20319 |
| Monotonicity | Not monotonic |
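The quantiles above show how far the tails sit from the bulk of the data. As an illustration, the sketch below applies Tukey's 1.5 × IQR rule to the reported quartiles. The report itself does not apply this rule; it is brought in here only as a familiar yardstick.

```python
# Quantiles of unit_sales, copied from the report.
Q1, Q3 = 2, 9
P95 = 29
MINIMUM, MAXIMUM = -15372, 89440

iqr = Q3 - Q1

# Tukey's rule (an assumption of this sketch, not part of the report):
# values beyond 1.5 * IQR from the quartiles are flagged as outliers.
lower_fence = Q1 - 1.5 * iqr
upper_fence = Q3 + 1.5 * iqr

print(f"IQR = {iqr}, fences = [{lower_fence}, {upper_fence}]")
print(f"95th percentile above upper fence: {P95 > upper_fence}")
print(f"maximum / upper fence = {MAXIMUM / upper_fence:.0f}")
```

Since the 95-th percentile already lies above the upper fence, more than 5% of the rows would count as outliers under this rule, which suggests that a log transform or clipping may be worth considering before modelling.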
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 23444825 | 18.7% |
| 2 | 17749070 | 14.1% |
| 3 | 13263841 | 10.6% |
| 4 | 10216998 | 8.1% |
| 5 | 7958957 | 6.3% |
| 6 | 6423645 | 5.1% |
| 7 | 5078334 | 4.0% |
| 8 | 4163234 | 3.3% |
| 9 | 3403350 | 2.7% |
| 10 | 2879594 | 2.3% |
| Other values (258464) | 30915192 | 24.6% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| -15372 | 1 | < 0.1% |
| -10002 | 1 | < 0.1% |
| -4673 | 1 | < 0.1% |
| -3606 | 1 | < 0.1% |
| -3600 | 1 | < 0.1% |
| -3451.363 | 1 | < 0.1% |
| -2487 | 1 | < 0.1% |
| -2400 | 2 | < 0.1% |
| -1943 | 1 | < 0.1% |
| -1806 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 89440 | 1 | < 0.1% |
| 44142 | 1 | < 0.1% |
| 30000 | 1 | < 0.1% |
| 20748 | 1 | < 0.1% |
| 20000 | 1 | < 0.1% |
| 17146 | 1 | < 0.1% |
| 16000 | 1 | < 0.1% |
| 15375 | 1 | < 0.1% |
| 15000 | 1 | < 0.1% |
| 14483 | 1 | < 0.1% |
onpromotion
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 21657651 |
| Missing (%) | 17.3% |
| Memory size | 239.4 MiB |
| Value | Count |
|---|---|
| False | 96028767 |
| True | 7810622 |
| (Missing) | 21657651 |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| False | 96028767 | 76.5% |
| True | 7810622 | 6.2% |
| (Missing) | 21657651 | 17.3% |
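The 61.5% imbalance figure in the alerts can be reproduced from these counts. The sketch below assumes the score is one minus the normalised Shannon entropy of the non-missing classes. That definition is an assumption of this example rather than something stated in the report, although it does yield the reported value.

```python
import math


def imbalance_score(class_counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits.

    The score is 0 for perfectly balanced classes and approaches 1 as a
    single class dominates. Fewer than two classes gives 0 by convention.
    """
    k = len(class_counts)
    if k < 2:
        return 0.0
    total = sum(class_counts)
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in class_counts if c > 0
    )
    return 1 - entropy / math.log2(k)


# Non-missing onpromotion counts from the table above (False, True).
score = imbalance_score([96_028_767, 7_810_622])
print(f"imbalance = {100 * score:.1f}%")
```

Missing values are left out of the calculation here. Treating them as a third class would give a different score.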
Interactions
Correlations
| | id | item_nbr | onpromotion | store_nbr | unit_sales |
|---|---|---|---|---|---|
| id | 1.000 | 0.302 | 0.152 | 0.023 | -0.050 |
| item_nbr | 0.302 | 1.000 | 0.073 | 0.014 | -0.004 |
| onpromotion | 0.152 | 0.073 | 1.000 | 0.024 | 0.001 |
| store_nbr | 0.023 | 0.014 | 0.024 | 1.000 | 0.079 |
| unit_sales | -0.050 | -0.004 | 0.001 | 0.079 | 1.000 |